Exchangeability Characterizes Optimality of Sequential Normalized Maximum Likelihood and Bayesian Prediction with Jeffreys Prior
Authors
Abstract
We study online prediction of individual sequences under logarithmic loss with parametric constant experts. The optimal strategy, normalized maximum likelihood (NML), is computationally demanding and requires the length of the game to be known. We consider two simpler strategies: sequential normalized maximum likelihood (SNML), which computes the NML forecasts at each round as if it were the last round, and Bayesian prediction. Under appropriate conditions, both are known to achieve near-optimal regret. In this paper, we investigate when these strategies are optimal. We show that SNML is optimal iff the joint distribution on sequences defined by SNML is exchangeable. This property also characterizes the optimality of a Bayesian prediction strategy for an exponential family. The optimal prior distribution is Jeffreys prior.
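The exchangeability criterion in the abstract can be checked numerically. The sketch below (illustrative code, not from the paper) computes the SNML forecasts for a Bernoulli model with exact rational arithmetic and shows that two orderings of the same outcomes receive different joint probabilities, so the SNML joint is not exchangeable in this family:

```python
from fractions import Fraction

# Illustrative sketch: sequential normalized maximum likelihood (SNML)
# for a Bernoulli model, and a check of whether the joint distribution
# it induces on sequences is exchangeable (order-invariant).

def max_lik(k, t):
    """Maximized Bernoulli likelihood of a sequence with k ones in t trials."""
    if t == 0:
        return Fraction(1)
    p = Fraction(k, t)
    return p**k * (1 - p)**(t - k)

def snml_next(k, t):
    """SNML probability that the next symbol is 1, given k ones in t trials."""
    num1 = max_lik(k + 1, t + 1)  # continue with a 1, then maximize
    num0 = max_lik(k, t + 1)      # continue with a 0, then maximize
    return Fraction(num1, num1 + num0)

def snml_joint(seq):
    """Joint SNML probability of a binary sequence (product of forecasts)."""
    prob, k = Fraction(1), 0
    for t, x in enumerate(seq):
        p1 = snml_next(k, t)
        prob *= p1 if x == 1 else 1 - p1
        k += x
    return prob

# Two permutations of the same multiset of outcomes get different
# probabilities, so the SNML joint is NOT exchangeable for Bernoulli.
print(snml_joint((1, 1, 0)))  # 8/155
print(snml_joint((1, 0, 1)))  # 1/20
```

Since the two permutations disagree, the paper's characterization says SNML is not the minimax-optimal strategy for Bernoulli, consistent with the known gap between SNML and fixed-horizon NML in this family.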
Related resources
Horizon-Independent Optimal Prediction with Log-Loss in Exponential Families
We study online learning under logarithmic loss with regular parametric models. Hedayati and Bartlett (2012b) showed that a Bayesian prediction strategy with Jeffreys prior and sequential normalized maximum likelihood (SNML) coincide and are optimal if and only if the latter is exchangeable, and if and only if the optimal strategy can be calculated without knowing the time horizon in advance. T...
Commentary on "The Optimality of Jeffreys Prior for Online Density Estimation and the Asymptotic Normality of Maximum Likelihood Estimators"
In the field of prediction with expert advice, a standard goal is to sequentially predict data as well as the best expert in some reference set of ‘expert predictors’. Universal data compression, a subfield of information theory, can be thought of as a special case. Here, the set of expert predictors is a statistical model, i.e. a family of probability distributions, and the predictions are sco...
Bayesian estimation and prediction with multiply Type-II censored samples of sequential order statistics from one- and two-parameter exponential distributions
This article introduces sequential order statistics. Based on a multiply Type-II censored sample of sequential order statistics, Bayesian estimators are derived for the parameters of one- and two-parameter exponential distributions, under the assumption that the prior distribution is an inverse gamma distribution, and the Bayes estimator with respect to squared error loss ...
Laplace's Rule of Succession in Information Geometry
When observing data x_1, ..., x_t modelled by a probability distribution p_θ(x), the maximum likelihood (ML) estimator θ_ML = argmax_θ Σ_{i=1}^t ln p_θ(x_i) cannot, in general, safely be used to predict x_{t+1}. For instance, for a Bernoulli process, if only "tails" have been observed so far, the probability of "heads" is estimated to 0. Laplace's famous "add-one" rule of succession (e.g., [Grü07]) re...
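The three Bernoulli estimators contrasted in the blurb above can be written in a few lines (a sketch for illustration; values are exact rationals, with k heads observed in t tosses):

```python
from fractions import Fraction

# Predicted probability of "heads" after k heads in t tosses.

def ml(k, t):
    """Maximum likelihood plug-in: k/t (degenerate at the boundary)."""
    return Fraction(k, t)

def laplace(k, t):
    """Laplace's add-one rule: Bayes with a uniform prior on the bias."""
    return Fraction(k + 1, t + 2)

def kt(k, t):
    """Krichevsky-Trofimov: Bayes with Jeffreys prior Beta(1/2, 1/2)."""
    return Fraction(2 * k + 1, 2 * (t + 1))

# After observing 5 tails and no heads:
print(ml(0, 5))       # 0   -> infinite log loss if a head then occurs
print(laplace(0, 5))  # 1/7
print(kt(0, 5))       # 1/12
```

Both Bayesian rules keep every probability strictly positive; the Jeffreys-prior rule is the one whose optimality the main paper characterizes.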
Maximum Likelihood vs. Sequential Normalized Maximum Likelihood in On-line Density Estimation
The paper considers sequential prediction of individual sequences with log loss (online density estimation) using an exponential family of distributions. We first analyze the regret of the maximum likelihood (“follow the leader”) strategy. We find that this strategy is (1) suboptimal and (2) requires an additional assumption about boundedness of the data sequence. We then show that both problem...
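The follow-the-leader failure mentioned above is easy to exhibit (a hedged sketch, not code from that paper): the plug-in ML predictor for Bernoulli assigns probability k/t to the next outcome being 1, so after t zeros it puts zero mass on a 1 and suffers infinite log loss, while SNML keeps every forecast strictly positive:

```python
from fractions import Fraction

def ml_next(k, t):
    """Plug-in ML forecast of a 1, given k ones in t trials."""
    return Fraction(k, t) if t > 0 else Fraction(1, 2)

def snml_next(k, t):
    """SNML forecast of a 1, given k ones in t trials."""
    def lik(j, n):  # maximized Bernoulli likelihood, j ones in n trials
        p = Fraction(j, n)
        return p**j * (1 - p)**(n - j)
    a, b = lik(k + 1, t + 1), lik(k, t + 1)
    return Fraction(a, a + b)

t = 10  # ten zeros observed; suppose a one arrives next
print(ml_next(0, t))    # 0 -> log loss -log(0) is infinite
print(snml_next(0, t))  # small but positive, so the log loss stays finite
```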
Publication date: 2012